Abstract

We study the problem of the emergence of cooperation in the spatial Prisoner's Dilemma. 
The pioneering work by Nowak and May (1992) showed that large initial populations of 
cooperators can survive and sustain cooperation in a square lattice with imitate-the-best 
evolutionary dynamics. We revisit this problem in a cost-benefit formulation suitable for 
a number of biological applications. We show that if a fixed-amount reward is established 
for cooperators to share, a single cooperator can invade a population of defectors and form 
structures that are resilient to re-invasion even if the reward mechanism is turned off. We 
discuss analytically the case of the invasion by a single cooperator and present agent-based 
simulations for small initial fractions of cooperators. Large cooperation levels, in the
sustainability range, are found. In the conclusions we discuss possible applications of this
model as well as its connections with other mechanisms proposed to promote the emergence
of cooperation.

1 Introduction



The emergence of cooperative behavior among unrelated individuals is one of
the most prominent unsolved problems of current research (Pennisi, 2005). While
such non-kin cooperation is evident in human societies (Hammerstein, 2003), it
is by no means exclusive of them, and can be observed in many different species
(Doebeli and Hauert, 2005) down to the level of microorganisms (Velicer, 2003;
Wingreen and Levin, 2006). This conundrum can be suitably formulated in terms of
evolutionary game theory (Maynard-Smith, 1982; Gintis, 2000; Nowak and Sigmund,
2004; Nowak, 2006) by studying games that are stylized versions of social dilemmas
(Kollock, 1998), i.e., situations in which individually reasonable behavior
leads to a situation in which everyone is worse off than they might have been
otherwise. Paradigmatic examples of these dilemmas are the provision of public goods
(Samuelson, 1954), the tragedy of the commons (Hardin, 1968), and the Prisoner's
Dilemma (PD) (Axelrod and Hamilton, 1981). The first two of them involve multiple
actors, while the latter involves only two actors, this last case being the setting
of choice for a majority of models on the evolution of cooperation.



The PD embodies a stringent form of social dilemma, namely a situation in which 
individuals can benefit from mutual cooperation but they can do even better by 
exploiting cooperation of others. To be specific, the two players in the PD can adopt 
either one of two strategies: cooperate (C) or defect (D). Cooperation results in a 
benefit b to the opposing player, but incurs a cost c to the cooperator (where b > 
c > 0). Defection has no costs and produces no benefits. Therefore, if the opponent 
plays C, a player gets the payoff b - c if she also plays C, but she can do even better
and get b if she plays D. On the other hand, if the opponent plays D, a player gets
the lowest payoff -c if she plays C, and she gets 0 if she also defects. In either case,
each player does better by playing D, in spite of the fact that mutual cooperation
would yield higher benefits for both, hence the dilemma.
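
The dominance argument of this paragraph can be checked with a few lines of code. The
following is only a minimal sketch: the function name `pd_payoff` and the sample values
b = 3, c = 1 are illustrative choices and do not come from the model description.

```python
def pd_payoff(my_move: str, other_move: str, b: float, c: float) -> float:
    """Payoff to the focal player in the cost-benefit Prisoner's Dilemma.

    Moves are "C" (cooperate) or "D" (defect); the text assumes b > c > 0.
    """
    received = b if other_move == "C" else 0.0  # benefit from a cooperating opponent
    paid = c if my_move == "C" else 0.0         # cost of one's own cooperation
    return received - paid


if __name__ == "__main__":
    b, c = 3.0, 1.0  # illustrative values only
    for other in ("C", "D"):
        # Whatever the opponent does, defecting pays exactly c more than cooperating.
        print(other, pd_payoff("C", other, b, c), pd_payoff("D", other, b, c))
```

Mutual cooperation still beats mutual defection (b - c against 0), which is the source of
the dilemma.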



Conflicting situations that can be described by the PD, either at the level of
individuals or at the level of populations, are ubiquitous. Thus, Turner and Chao (1999)
showed that interactions between RNA phages co-infecting bacteria are governed
by a PD. Escherichia coli stationary phase GASP mutants in starved cultures are
another example of this dilemma (Vulic and Kolter, 2001). A PD also arises when
different yeasts compete by switching from respiration to respirofermentation when
resources are limited (Frick and Schuster, 2003). Hermaphroditic fish that alternately
release sperm and eggs end up involved in a PD with cheaters that release
only sperm with less metabolic effort (Dugatkin and Mesterton-Gibbons, 1996). A
recent study of cooperative territorial defence in lions (Panthera leo) described
the correct ranking structure for a PD (Legge, 1996). And, of course, the PD applies
to very many different situations of interactions between human individuals
or collectives (Axelrod, 1984; Camerer, 2003).

In view of its wide applicability, the PD is a suitable context to pose the question
of the emergence of cooperation. How do cooperative individuals or populations
survive or even thrive in the context of a PD, where defecting is the only
evolutionarily stable strategy (Maynard-Smith, 1982; Nowak, 2006)? Several answers
to this puzzle have been put forward (Nowak, 2006), among which the most
relevant examples are kin selection theory (Hamilton, 1964), reciprocal altruism
or direct reciprocity (Trivers, 1971; Axelrod and Hamilton, 1981), indirect
reciprocity (Nowak and Sigmund, 1998), emergence of cooperation through punishment
(Fehr and Gachter, 2002) or the existence of a spatial or social structure of
interactions (Nowak and May, 1992). This last approach has received a great deal
of attention in the last decade and has proven a source of important insights into the
evolution of cooperation (see Szabo and Fath (2007) for a recent and comprehensive
review). One such insight is the fact that cooperators can outcompete defectors
by forming clusters where they help each other. This result, in turn, leaves open the
question of the emergence of cooperation in a population with a majority of defectors.
Recently, it has been shown (Ohtsuki et al., 2006) that, if the average number
of connections in the interaction network is k, the condition b/c > k implies that
selection favors cooperators invading defectors in the weak selection limit, i.e., when
the contribution of the game to the fitness of the individual is very small. However,
a general result valid for any intensity of the selection is still lacking.



In this paper, we propose a new mechanism for the emergence of cooperation,
which we call shared reward. In this setting, players interact through a standard
PD, but in a second stage cooperators receive an additional payoff coming from
a resource available only to them and not to defectors. It should be emphasized
that similar reward mechanisms may be relevant for a number of specific applications,
such as mutualistic situations with selection imposed by hosts rewarding
cooperation or punishing less cooperative behavior (see, e.g., Kiers et al. (2003)
and references therein). Another context that may be modelled by our approach
is team formation in animal societies (Anderson and Franks, 2001), e.g., in
cooperative hunting (Packer and Ruttan, 1988). On the other hand, the idea of a shared
reward could be implemented in practice as a way to promote cooperation in human
groups or, alternatively, may arise from costly signaling prior to the game, when the
exchange of cooperative signals among cooperators is free (Skyrms, 2004). As we
will see, this scheme makes it possible for a single cooperator to invade a population
of defectors. Furthermore, when strategies evolve by unconditional imitation
(Nowak and May, 1992), cooperation persists after the additional resource has been
exhausted or turned off. We present evidence for these conclusions coming from
numerical simulations on a regular network. In the conclusion, we discuss the reason
for this surprising result and the relation of our proposal to previous work on
evolutionary games on graphs and to public goods games.

2 Spatial Prisoner's dilemma with shared reward



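Our model is defined by a two-stage game on a network. In the first stage, players
interact with their neighbors and obtain payoffs as prescribed by the PD game,
whose payoff matrix in a cost-benefit context is given by

$$
\begin{array}{c|cc}
  & C & D \\ \hline
C & b - c & -c \\
D & b & 0
\end{array}
\qquad (1)
$$

where rows give the strategy of the focal player and columns that of her opponent.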



Subsequently, in the second stage of the game, a fixed amount p is distributed
among all cooperators. It is important to realize at this point that such a two-stage
game is only interesting in a population setting: in a two-player game, the second
stage would amount to shifting the cooperator's payoff by p/2 or p, depending on the
opponent's strategy. Then, for p < c we would simply have another PD, whereas
for 2c > p > c we would have the Hawk-Dove or Snowdrift game (Maynard-Smith,
1982), and for p > 2c we would have the trivial Harmony game, also called Byproduct
Mutualism (Dugatkin et al., 1992; Connor, 1995). In a population setting, the
amount received by a cooperator depends on the number of cooperators in the total
population and is therefore subject to evolution as the population itself changes.
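
The two-player classification just described follows from two comparisons, one for each
strategy of the opponent. A short sketch makes this explicit; the function name and the
treatment of the boundary values p = c and p = 2c are assumptions made here for
illustration, since the text only gives the strict inequalities.

```python
def classify_two_player_game(p: float, c: float) -> str:
    """Classify the two-player game obtained after adding the reward p.

    Payoffs to the row player (b cancels out of both comparisons):
        C vs C: b - c + p/2    C vs D: -c + p
        D vs C: b              D vs D: 0
    """
    coop_best_against_c = p / 2 > c  # b - c + p/2 > b
    coop_best_against_d = p > c      # -c + p > 0
    if coop_best_against_c and coop_best_against_d:
        return "Harmony"
    if coop_best_against_d:
        return "Snowdrift"
    # Ties at p == c or p == 2c are lumped with the lower regime (a convention
    # chosen for this sketch only).
    return "Prisoner's Dilemma"
```

For instance, with c = 1 the values p = 0.5, 1.5 and 3 fall in the PD, Snowdrift and
Harmony regimes, respectively.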

In order to write down the payoffs for the game after the second stage, we need to
introduce some notation. Let us consider a population of N players, each of whom
plays the game against k other players. For player i, 1 ≤ i ≤ N, let us denote by ν_i
the number of cooperators among the opponents of i, and by N_c the total number of
cooperators in the population. The payoffs can then be written as follows:

$$
\begin{cases}
\nu_i b - k c + \dfrac{p}{N_c}, & \text{if } i \text{ cooperates}, \\[4pt]
\nu_i b, & \text{if } i \text{ defects}.
\end{cases}
\qquad (2)
$$
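
As an illustration of Eq. (2), the payoff of a single player can be written as a small
function. This is a sketch only; the name `shared_reward_payoff` and the argument names
are hypothetical, and k defaults to 4 in anticipation of the square lattice used below.

```python
def shared_reward_payoff(cooperates: bool, nu_i: int, n_coop: int,
                         b: float, c: float, p: float, k: int = 4) -> float:
    """Payoff of player i after both stages, following Eq. (2).

    nu_i   -- number of cooperators among the k opponents of i
    n_coop -- total number of cooperators in the population
    """
    if cooperates:
        # n_coop >= 1 here, because player i herself is a cooperator.
        return nu_i * b - k * c + p / n_coop
    return nu_i * b
```

For a lone cooperator surrounded by defectors, with the illustrative values b = 3, c = 1
and p = 10, the cooperator earns `shared_reward_payoff(True, 0, 1, 3.0, 1.0, 10.0)`,
that is 6, while each of her neighbors earns 3.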



This mechanism to reward cooperation has been studied by Cuesta et al. (2007) in
a game theoretical model of n players with no spatial structure. As stated above, our
goal here is to understand whether or not the mechanism of the shared reward can
explain the emergence of cooperation in the Prisoner's Dilemma on networks. To
address this problem, we will consider below this game in the framework of a spatial
setup following the same general lines as Nowak and May (1992) for comparison.
We place N individuals on a square lattice with periodic boundary conditions,
each of whom cooperates or defects with her neighbors (4, von Neumann neighborhood).
We have chosen this neighborhood for the sake of simplicity in the calculation;
results for the Moore neighborhood [used, e.g., by Nowak and May (1992)] can
be obtained in a straightforward manner. After receiving their payoffs according
to (2), all individuals update their strategy synchronously for the next round, by
imitate-the-best (also called unconditional imitation) dynamics: they look in their
neighborhood for players whose payoff is higher than their own. If there are any,
the player adopts the strategy that led to the highest payoff among them (randomly
chosen in case of a tie). We then repeat the process and let the simulation run until
the density of cooperators in the lattice reaches an asymptotic average value or else
it becomes 0 or 1 (note that these two states, corresponding to full defection and full
cooperation, are absorbing states of the dynamics because there are no mutations).
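
One round of the dynamics just described can be sketched in plain Python as follows.
This is not the code used for the simulations reported here; the function names, the
list-of-lists lattice and the use of the standard `random` module for tie-breaking are
choices made for this illustration, and the lattice side is assumed to be at least 3 so
that the four neighbors of a site are distinct.

```python
import random


def neighbours(i, j, L):
    """The four von Neumann neighbours of site (i, j) on an L x L periodic lattice."""
    return [((i - 1) % L, j), ((i + 1) % L, j), (i, (j - 1) % L), (i, (j + 1) % L)]


def lattice_payoffs(coop, b, c, p):
    """Payoffs of Eq. (2) with k = 4; coop[i][j] is True for a cooperator."""
    L = len(coop)
    n_coop = sum(map(sum, coop))
    share = p / n_coop if n_coop else 0.0
    pay = [[0.0] * L for _ in range(L)]
    for i in range(L):
        for j in range(L):
            nu = sum(coop[x][y] for x, y in neighbours(i, j, L))
            pay[i][j] = nu * b - 4 * c + share if coop[i][j] else nu * b
    return pay


def step(coop, b, c, p, rng=random):
    """One synchronous round of imitate-the-best (unconditional imitation)."""
    L = len(coop)
    pay = lattice_payoffs(coop, b, c, p)
    new = [row[:] for row in coop]
    for i in range(L):
        for j in range(L):
            nbrs = neighbours(i, j, L)
            best = max(pay[x][y] for x, y in nbrs)
            if best > pay[i][j]:  # imitate only a strictly better-off neighbour
                tied = [(x, y) for x, y in nbrs if pay[x][y] == best]
                x, y = rng.choice(tied)  # random tie-breaking
                new[i][j] = coop[x][y]
    return new
```

All strategies are read from the old lattice `coop` and written to `new`, which is what
makes the update synchronous. Iterating `step` until the number of cooperators stops
changing gives one realization of the process.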



From the work by Nowak and May (1992), we know that if we begin the simulation
with a sufficiently large cooperator density, then the lattice helps sustain the
cooperation level by allowing cooperators interacting with cooperators in clusters
to survive and avoid invasion by defectors; defectors thrive in the boundaries
between cooperator clusters. What we are interested in is the question of how the
large initial cooperator level required by Nowak and May (1992) may arise; if the
initial number of cooperators is small, they cannot form clusters and full defection
is finally established. On the other hand, another relevant point is resilience, i.e.,
the resistance of the cooperator cluster to re-invasion by defectors. In this respect,
we note that while the clusters obtained by Nowak and May (1992) did show
resilience, their corresponding cooperation level was not large. As we will see below,
the mechanism we are proposing will lead to higher cooperation levels with good
resilience properties, even for medium costs. To address these issues, we begin by
discussing the invasion by a single cooperator placed on the center of the lattice
(in fact, on any site, as the periodic boundary conditions make all sites equivalent).
This, along with the possible scenarios of invasion by a single defector, will
lead to a classification of the different regimes in terms of the cost parameter.
Subsequently, we will carry out simulations with a very low initial concentration of
cooperators.



3 Invasion by a single cooperator and resilience of cooperation 



As our strategy update rule is unconditional imitation, the process is fully
deterministic, so we can compute analytically the evolution of the process. Consider
the first cooperator, seeded at time t = 0. She earns p - 4c, whereas each of her
defector neighbors earns b, so she transforms them into new cooperators only if
p > b + 4c; otherwise, the cooperator is changed into a defector and the evolution
ends. If the condition is satisfied, the four neighbors become cooperators, and we
have now a rhomb centered on the site of the initial cooperator. In what follows, we
discuss the generic situation in the subsequent evolution of the system.

After the initial cooperator has given rise to a rhomb, there will always be four 
types of players in the system: 

• The cooperators in the bulk, that interact with another four cooperators.

• The cooperators in the boundary, defined as the set of cooperators that have links
with defectors. These boundary players have two cooperator neighbors or only
one if they are at the corners of the rhomb, but the key point is that they are
always connected to a cooperator that interacts only with cooperators.

• The defectors in the boundary, that interact with one (opposite to the corners of 
the rhomb) or two cooperators. 

• The defectors in the bulk, that interact with another four defectors. 



For the rhomb to grow two conditions must be met: first of all, the payoff obtained
by the boundary cooperators at the corners (b - 4c plus the reward contribution)
has to be larger than that of the boundary defectors with only one cooperator (b);
secondly, the payoff obtained by cooperators that have two cooperator neighbors
(2b - 4c plus the reward contribution) has to be larger than that of the boundary
defectors that interact with two cooperators (2b). If both conditions are verified,
defectors are forced to become cooperators by imitation. Therefore, we must have

$$
b - 4c + \frac{p}{N_c(t)} > b
\quad \text{and} \quad
2b - 4c + \frac{p}{N_c(t)} > 2b
\quad \Longrightarrow \quad
\frac{p}{N_c(t)} > 4c. \qquad (3)
$$

We thus find that the condition for invasion does not depend on the benefit b. In
addition, it predicts that invasion proceeds until the rhomb contains too many
cooperators so that the condition is not fulfilled anymore. In view of this result, we
find it convenient to introduce a parameter to measure the reward in terms of the
cost:

$$
\delta = \frac{p}{4cN}. \qquad (4)
$$

With this notation, the prediction for the invasion by a single cooperator is that
it will proceed as long as the fraction of cooperators verifies N_c(t)/N < δ. N_c(t),
the number of cooperators at time t, can be easily determined from the recurrence
relation for the growing rhomb: in case the cooperators increase, a new boundary
layer is added to the rhomb, and we have N_c(t) = N_c(t - 1) + 4t, which can be
immediately solved (with initial condition N_c(0) = 1) to give N_c(t) = 2t^2 + 2t + 1.
Inserting this result in the above condition allows one to determine the maximum
growth time for the cluster, that is t* = max{t : 2t^2 + 2t + 1 < δN}, and hence the
fraction of cooperators in the steady state (5).
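
The recurrence for the growing rhomb and the definition of t* are easy to check
numerically. The sketch below uses hypothetical function names and assumes that the
rhomb is small enough not to overlap with itself through the periodic boundaries, so
that each new layer adds exactly 4t sites.

```python
def rhomb_size(t: int) -> int:
    """Number of cooperators N_c(t) in the rhomb after t growth steps."""
    n = 1                          # N_c(0) = 1, the seed
    for layer in range(1, t + 1):
        n += 4 * layer             # N_c(t) = N_c(t - 1) + 4t
    return n


def max_growth_time(delta: float, n_players: int) -> int:
    """t* = max{t : 2t^2 + 2t + 1 < delta * N}; returns -1 if no t qualifies."""
    t = -1
    while rhomb_size(t + 1) < delta * n_players:
        t += 1
    return t
```

Summing the layers reproduces the closed form 2t^2 + 2t + 1 (1, 5, 13, 25, ... for
t = 0, 1, 2, 3, ...). For a lattice of N = 10000 sites and δ = 0.1, the condition
N_c(t) < 1000 holds up to t = 21, where N_c = 925, and fails at t = 22, where
N_c = 1013.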

So far, we have seen that when the reward is large enough (p > 4cN), full
cooperation sets in, whereas for smaller reward, a cooperator cluster grows up to a final
size that depends on δ. Interestingly, when b/2 > c, the reward mechanism is only
needed to establish an initial population of cooperators, i.e., the rhomb is resilient.
To show this, notice that boundary cooperators observe the defectors that earn the
largest payoff (those linked to two cooperators) and compare it with the
payoff obtained by bulk cooperators; boundary cooperators are linked to both and
unconditional imitation will lead them to adopt the strategy of the neighbor with
the largest payoff. The condition for the cooperators to resist re-invasion is then

$$
4(b - c) > 2b \quad \Longleftrightarrow \quad c < \frac{b}{2}. \qquad (6)
$$

Indeed, if after a number of time steps we turn off the reward, the rhomb structure
arising from the evolutionary process cannot be re-invaded by defectors, as can be
seen from Eq. (6). In the opposite case, c > b/2, the reward must be kept at all
times to stabilize the cooperator cluster.

Fig. 1. Final stage of the invasion of a cooperator population by a single defector, for the
medium cost case b/4 < c < b/2. Defectors are white, cooperators are black.

In order to study the resilience of clusters of cooperators, we consider the simplest
case of invasion by a single defector in the Prisoner's Dilemma (without reward).
It can be easily shown that this leads to three different cost regimes (Jimenez et al.,
2007):

• Low cost case, c < b/4: the defector is only able to invade its 4 neighbors, giving
rise to a 5-defector rhomb.

• Medium cost case, b/4 < c < b/2: a structure with the shape of a cross with
sawtooth boundaries is formed, implying a finite density of defectors in the final
state (cf. Fig. 1).

• High cost case, c > b/2: the system is fully invaded by the defector, and
cooperators go extinct.
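
For reference when reading the simulation results, the three regimes and the resilience
condition (6) can be encoded as two small helpers. The names are hypothetical, and the
assignment of the boundary values c = b/4 and c = b/2 is an arbitrary convention of this
sketch, since the text only specifies the strict inequalities.

```python
def cost_regime(b: float, c: float) -> str:
    """Cost regime for invasion by a single defector (no reward)."""
    if c < b / 4:
        return "low"
    if c < b / 2:
        return "medium"
    return "high"  # boundary values are lumped upwards by convention here


def resists_reinvasion(b: float, c: float) -> bool:
    """Condition (6): a bulk cooperator out-earns a defector with two cooperator neighbors."""
    return 4 * (b - c) > 2 * b
```

With b = 1, the costs c = 0.2, 0.4 and 0.7 fall in the low, medium and high regimes,
and only the first two satisfy condition (6).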

4 Simulations with an initial concentration of defectors



After considering the case of the invasion of a defecting population by a single
cooperator, we now proceed to a more general situation in which a number of
cooperators are initially distributed at random on the lattice. To this end, we have
carried out simulations on square lattices of size N = 100 × 100 for different initial
numbers of cooperators as a function of the cost parameter (we take b = 1 for
reference) and the reward. A single simulation consists of running the game until
a steady state is reached, as shown by the fraction of cooperators becoming
approximately constant. Generally speaking, the steady state is reached in some 100
games per player. For every choice of parameters, we compute an average over 100
realizations of the initial distribution of the cooperators. Results are shown in Fig.
2 for low, medium and high costs.



Figure 2 shows a number of remarkable features. To begin with, the case of invasion
by a single cooperator reproduces the analytical result (5). On the other hand, in all
three plots we see that if instead of a single cooperator we have an initial density
of cooperators, the resulting level of cooperation is considerably higher, particularly
when costs are low. Indeed, by looking at panel a), for which c = 0.2 (b = 1), we see
that with 10% of initial cooperators cooperation sets in even without reward, as
observed by Nowak and May (1992). Notwithstanding, a more remarkable result
is the fact that with an initial density of cooperators as low as 0.1% we find large
cooperation levels for small rewards, for all values of costs. Clearly, the cooperation
level decreases with increasing cost, but even for high costs [panel c), c = 0.7],
the cooperation level is significantly higher than the single cooperator one. In this
last case, we also observe that the final state becomes practically independent of
the density of initial cooperators. Finally, an intriguing result is that in the low cost
case, the observed cooperation fraction is not a monotonically increasing function
of the reward: as can be seen from the plot, for moderate and particularly for
large initial densities of cooperators, increasing the reward may lead to lower levels
of cooperation. The reason for this phenomenon is that, if the reward increases, the
cooperator clusters arising from the cooperator invaders grow larger and overlap.
Therefore, clusters with rugged boundaries are formed, allowing for defectors with
three cooperator neighbors which may then be able to reinvade. Further increments
of the reward restore the cooperation levels because then even these special defectors
are overridden. The important consequence is that one cannot assume that, for any
situation, increasing the reward leads to an increase in cooperation, i.e., one has
to be careful in designing the reward for each specific application.



The other relevant issue to address in the simulations is the resilience of the attained
cooperation levels. Figure 3 summarizes our results in this regard. Both for low
and high rewards, we confirm the result for the single cooperator invasion that
cooperation disappears if the reward is turned off when the costs are high (c =
0.7 > b/2). For moderate and low costs, the structures arising from the evolution
with reward do show resilience, at least to some degree. Interestingly, the case of
low reward [panel a)] gives rise to extremely robust cooperation levels, whereas
higher rewards [panel b)] lead to structures for which cooperation decreases when
the reward is absent (medium cost case). This result is connected with the one
already discussed that the cooperation level may not be monotonic in the reward,
and makes it clear that structures originating from a very aggressive, high reward
policy may be less resilient than those built with low rewards.

Fig. 2. Average fraction of cooperators in the steady state as a function of the rescaled
reward δ = p/(4cN), obtained starting with 1 (+), 10 (*), 100 (o), and 1000 (o) initial
cooperators. a) low cost, c = 0.2; b) medium cost, c = 0.4; c) high cost, c = 0.7.

Fig. 3. Time evolution of the fraction of cooperators for the cases of low (dot-dashed line,
c = 0.2), medium (solid line, c = 0.4) and high (dashed line, c = 0.7) costs, for simulations
starting with 100 initial cooperators (density, 1%) randomly distributed. Shown are the
cases of a) low (δ = 0.1) and b) high (δ = 0.5) rewards. Reward is set in place until t = 100
and turned off afterwards.

Fig. 4. System snapshots at the stationary state of a single realization of the evolution
(before switching off the reward, see Fig. 3) for the low reward case (δ = 0.1). The initial
density of cooperators is 1%. a) low cost (c = 0.2), b) medium cost (c = 0.4). Defectors are
white, cooperators are black.

Further insight into the cluster structure arising from the invasion process fueled
by the reward can be gained from Figs. 4 and 5. Figure 4 shows the stationary
structure of the cooperator clusters for the low reward case (δ = 0.1). As we are now
considering that the initial configuration contains 1% of cooperators randomly
distributed, the shapes are irregular, and some rhombs are larger than others because
they merge during evolution. In accordance with Fig. 3, in the low cost situation the
cooperation level reached is much larger than in the medium cost case. However,
both structures are resilient and survive unchanged if the reward is removed. This
is due to the fact that, as discussed above, in that case defectors can never invade
a cooperating population. The final structure for the high cost case is similar to
Fig. 4b), but in this case suppression of the reward leads to an immediate invasion
by defectors until they occupy the whole system. When the reward is larger, the
situation is somewhat different, as can be appreciated from Fig. 5. While for low
cost we again obtain resilient structures that are preserved even without reward, in
the medium cost regime the patterns change. Panel a) shows the stationary state
reached with the reward; when the reward is taken away, the state changes and
evolves to the configuration shown in panel b). What is taking place here is that due
to the high reward, a cooperation level close to 1 is reached, most of the defectors
being isolated or along lines. When the reward is switched off, these defectors
are in a position to reap a large payoff from their interactions with the cooperators,
allowing for a partial reinvasion. Therefore, the final cooperation level has more or
less halved. We stress that even then the cooperation level that remains after the
suppression of the reward is rather large (about 60%), another hint of the efficiency
of this mechanism to promote cooperation.

Fig. 5. System snapshots at the stationary state of a single realization of the evolution, a)
before and b) after switching off the reward (see Fig. 3), for the high reward case (δ = 0.5)
and medium cost (c = 0.4). Defectors are white, cooperators are black.



5 Discussion and conclusions 



We have proposed a mechanism that allows a population of cooperators to grow
and reach sizeable proportions in the spatial Prisoner's Dilemma in a cost-benefit
framework. This mechanism is based on the distribution of a fixed-amount reward
among all cooperators at every time step. With this contribution to the payoffs of the
standard Prisoner's Dilemma, even a single cooperator is able to invade a fully
defecting population. The resulting cooperator fraction is determined by the amount
of the reward as compared to the total number of players and to the cost of the
interaction. Furthermore, for low and medium costs (c < b/2) cooperation is resilient
in the sense that if at some time step the reward is suppressed, the cooperator
cluster cannot be re-invaded by the defectors. Finally, we have seen that low rewards
are capable of inducing a very large cooperation level, so the mechanism works even
when it changes the payoffs of the Prisoner's Dilemma only slightly.



The result we have obtained is relevant, in the first place, as a necessary
complement of the original work by Nowak and May (1992) within the cost-benefit
context. In their work they showed that the spatial structure allowed cooperator
clusters to survive and resist invasion by defectors, but they began with a large
population of cooperators. Our work provides a putative explanation as to where
this population comes from. We note that in the original work by Nowak and May
(1992) they observed that the cooperation level decreased with respect to the initial
population, so a mechanism leading to the appearance of high cooperator levels is
certainly needed. In this regard, we want to stress that the reward mechanism gives
rise to structures with very good resilience properties: simulations without the
reward show that starting from a randomly distributed population of cooperators with
very large density (~ 90%), the final cooperation level is halved for low costs, and
practically disappears for moderate costs.



We stress that, to our knowledge, this is the first time that a mechanism based on
a fixed-amount reward to be shared among cooperators is proposed. Notwithstanding,
there are other proposals which are somewhat related to ours, most prominent
among them being those by Lugo and Jimenez (2006) and Hauert (2006).
Lugo and Jimenez (2006) introduce a tax mechanism in which everybody in the
population contributes towards a pool that is subsequently distributed among the
cooperators. This is different from the present proposal in so far as the contribution
from the tax is not a fixed quantity but rather it increases with the average payoff.
On the other hand, Hauert (2006) focuses on the effects of nonlinear discounts (or
synergistic enhancement) depending on the number of cooperators in the groups of
interacting individuals. Although the corresponding game theoretical model,
discussed by Hauert et al. (2006), belongs to the same general class of n-player games
as our shared reward model, the spatial implementation of the two models is very
different. Thus, in Hauert (2006), payoffs for a given individual depend on the
number of cooperators in her neighborhood, whereas in the present work payoffs
depend on the total number of cooperators in the network. On the other hand, our
interest is also different, in so far as we are discussing a mechanism to foster the
appearance of an initial, sizeable population of cooperators which can later be stable
without this additional resource. It is important to stress that with our mechanism
a large level of cooperation can be established and (in the appropriate parameter
range) stabilized.



We believe that our results may be relevant for a number of experimental situations
where the Prisoner's Dilemma has been shown to appear in nature. Thus, the
stabilization of mutualistic symbioses by rewards or sanctions as observed in, e.g.,
legume-rhizobium mutualism (Kiers et al., 2003) is related to the mechanism we
are proposing here: it is observed that soybeans penalize rhizobia that fail to fix
N2 in their root nodules. This decreases the defector's payoff, which is similar to
increasing the cooperator's payoff by a reward. On the other hand, a description
of the interaction between different strains of microorganisms [see Crespi (2001),
Velicer (2003) and references therein] in terms of this reward mechanism instead
of the standard Prisoner's Dilemma may prove more accurate and closer to the
actual interaction process. An example could be the evolution of cooperators with
reduced sensitivity to defectors in the RNA phage φ6 (Turner and Chao, 1999,
2003). Cooperative foraging is another context where the mechanism of rewarding
cooperation may be relevant, ranging from microorganisms such as Myxococcus
xanthus (Dworkin, 1996) through beetles (Berryman et al., 1985) to wolves or lions
(Anderson and Franks, 2001). Finally, the question arises as to the validity of such
a mechanism to promote cooperation within humans, as individual players cannot
predict in advance the additional payoff they will obtain from the reward, and
therefore it is not clear whether it would have an influence on them or not. Evidence
from cooperative hunting in humans (Alvard, 2001; Alvard, 2003) shows that high
levels of sharing help sustain cooperative behavior. However, in the human case,
contexts where the reward would be more explicitly included in a manner transparent
to the players are possible and amenable to experiments. Research along these
lines is necessary to assess the possible role of the reward mechanism in specific
situations.